Pierre
Predicting Delayed Trajectories Using Network Features: A Study on the Dutch Railway Network
Kampere, Merel, Alsahag, Ali Mohammed Mansoor
The Dutch railway network is one of the busiest in the world, with delays being a prominent concern for the principal passenger railway operator NS. This research addresses a gap in delay prediction studies within the Dutch railway network by employing an XGBoost Classifier with a focus on topological features. Current research predominantly emphasizes short-term predictions and neglects the broader network-wide patterns essential for mitigating ripple effects. This research implements and improves an existing methodology, originally designed to forecast the evolution of the fast-changing US air network, to predict delays in the Dutch Railways. By integrating Node Centrality Measures and comparing multiple classifiers like RandomForest, DecisionTree, GradientBoosting, AdaBoost, and LogisticRegression, the goal is to predict delayed trajectories. However, the results reveal limited performance, especially in non-simultaneous testing scenarios, suggesting the necessity for more context-specific adaptations. Regardless, this research contributes to the understanding of transportation network evaluation and proposes future directions for developing more robust predictive models for delays.
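The node-centrality features mentioned in the abstract can be illustrated with a minimal sketch. It computes degree centrality on a toy undirected graph and turns it into a feature vector for one trajectory. The station labels, the edge list, the `trajectory_features` helper, and the treatment of a trajectory as an origin-destination pair are all assumptions made for illustration; they are not taken from the paper, which may use other centrality measures and feature definitions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalised degree centrality for an undirected edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    # Degree divided by the maximum possible degree (n - 1).
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def trajectory_features(trajectory, centrality):
    """Hypothetical helper: centrality of the origin and destination nodes."""
    origin, destination = trajectory
    return [centrality.get(origin, 0.0), centrality.get(destination, 0.0)]


# Toy network with placeholder station labels (not real topology).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_centrality = degree_centrality(toy_edges)
toy_features = trajectory_features(("A", "D"), toy_centrality)
```

Feature vectors of this kind would then be passed, together with delay labels, to a classifier such as the XGBoost model named in the abstract.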
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- Europe > Netherlands > South Holland > Leiden (0.04)
Geological Inference from Textual Data using Word Embeddings
Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil
This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensionality reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we calculate the proximity between the ten cities most semantically related to the target keyword and identified mine locations using the haversine equation. The results demonstrate that combining NLP with dimensionality reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the predicted cities fall in the same region as the expected locations, the accuracy has room for improvement.
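The ranking and benchmarking steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, and the Earth-radius constant are assumptions, and real GloVe embeddings would have many more dimensions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated
    return dot / (norm_u * norm_v)


def rank_places(target_vec, place_vecs, top_k=10):
    """Return the top_k place names ordered by similarity to the target."""
    ranked = sorted(
        place_vecs.items(),
        key=lambda item: cosine_similarity(target_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Toy vectors with placeholder names (hypothetical data).
toy_places = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
toy_ranking = rank_places([1.0, 0.0], toy_places, top_k=3)
```

Each ranked city's coordinates would then be compared against known mine coordinates with `haversine_km` to obtain the proximity used for benchmarking.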
- Europe > United Kingdom (0.05)
- Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
- North America > Canada > British Columbia (0.04)
- Energy (0.94)
- Materials > Metals & Mining > Lithium (0.50)
Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems
Davis, Ernest, Aaronson, Scott
Our test sets were too small and too haphazard to support statistically valid conclusions, but they were suggestive of a number of conclusions. We summarize these here, and discuss them at greater length in section 7. Over the kinds of problems tested, GPT-4 with either plug-in is significantly stronger than GPT-4 by itself, or, almost certainly, than any AI that existed a year ago. However, it is still far from reliable; it often outputs a wrong answer or fails to output any answer. In terms of overall score, we would judge that these systems perform on the level of a middling undergraduate student. However, their capacities and weaknesses do not align with those of a human student; the systems solve some problems that even capable students would find challenging, whereas they fail on some problems that even middling high school students would find easy.
- North America > United States > Michigan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada > Quebec (0.04)
GPT-4 Can't Reason
GPT-4 was released in March 2023 to wide acclaim, marking a very substantial improvement across the board over GPT-3.5 (OpenAI's previously best model, which had powered the initial release of ChatGPT). However, despite the genuinely impressive improvement, there are good reasons to be highly skeptical of GPT-4's ability to reason. This position paper discusses the nature of reasoning; criticizes the current formulation of reasoning problems in the NLP community, as well as the way in which LLM reasoning performance is currently evaluated; introduces a small collection of 21 diverse reasoning problems; and performs a detailed qualitative evaluation of GPT-4's performance on those problems. Based on this analysis, the paper concludes that, despite its occasional flashes of analytical brilliance, GPT-4 at present is utterly incapable of reasoning.
- North America > United States > Texas (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > South Dakota > Hughes County > Pierre (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report (1.00)
- Personal > Interview (0.92)
- Transportation (0.92)
- Health & Medicine (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Mallen, Alex, Asai, Akari, Zhong, Victor, Das, Rajarshi, Khashabi, Daniel, Hajishirzi, Hannaneh
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge. This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge, by conducting large-scale knowledge probing experiments of 10 models and 4 augmentation methods on PopQA, our new open-domain QA dataset with 14k questions. We find that LMs struggle with less popular factual knowledge, and that scaling fails to appreciably improve memorization of factual knowledge in the long tail. We then show that retrieval-augmented LMs largely outperform orders of magnitude larger LMs, while unassisted LMs remain competitive in questions about high-popularity entities. Based on those findings, we devise a simple, yet effective, method for powerful and efficient retrieval-augmented LMs, which retrieves non-parametric memories only when necessary. Experimental results show that this significantly improves models' performance while reducing the inference costs.
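The idea of retrieving only when necessary can be sketched in a few lines. The version below assumes that the decision is a simple threshold on an entity-popularity score, which matches the abstract's finding that unassisted LMs stay competitive on high-popularity entities; the function name, the threshold value, and the two stub answerers are hypothetical and stand in for real model calls.

```python
from typing import Callable


def adaptive_answer(
    question: str,
    popularity: float,
    threshold: float,
    answer_from_parameters: Callable[[str], str],
    answer_with_retrieval: Callable[[str], str],
) -> str:
    """Use retrieval only for questions about less popular entities."""
    if popularity >= threshold:
        # Popular entity: rely on the model's parametric memory alone.
        return answer_from_parameters(question)
    # Long-tail entity: pay the retrieval cost for non-parametric memory.
    return answer_with_retrieval(question)


# Stub answerers standing in for real model calls (hypothetical).
def parametric_stub(question: str) -> str:
    return "parametric"


def retrieval_stub(question: str) -> str:
    return "retrieved"


popular_route = adaptive_answer("q1", 5000.0, 1000.0, parametric_stub, retrieval_stub)
rare_route = adaptive_answer("q2", 12.0, 1000.0, parametric_stub, retrieval_stub)
```

Skipping retrieval for popular entities is what would reduce inference cost in such a scheme; how the threshold is chosen is left open here.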
- North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.04)
- North America > Canada (0.04)
- Europe > Kosovo > District of Gjakova > Rahovec (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
Wikipedia, "Jeopardy!," and the Fate of the Fact
Is it still cool to memorize a lot of stuff? Is there even a reason to memorize anything? Having a lot of information in your head was maybe never cool in the sexy-cool sense, more in the geeky-cool or class-brainiac sense. But people respected the ability to rattle off the names of all the state capitals, or to recite the periodic table. It was like the ability to dunk, or to play the piano by ear--something the average person can't do.
- North America > United States > South Dakota > Hughes County > Pierre (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > Wales (0.04)
- Media (0.70)
- Government > Regional Government > North America Government > United States Government (0.34)
- Information Technology > Communications > Social Media (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.30)
Queen's Speech: Government to announce plans for commercial space flights and ports for spaceships
Powers planned by the Government aiming to pave the way for commercial space flights in Britain will be included in the Queen's Speech alongside a raft of investments in transport infrastructure. The legislation, according to the Department for Transport (DfT), will allow the launch of satellites from the UK for the first time, horizontal flights to the edge of space for scientific experiments and the establishment of spaceports in regions across Britain. The Queen's Speech, which has been delayed by two days due to the current instability in British politics, will also include measures to improve conditions for the 100,000 drivers of plug-in vehicles by "removing barriers that are preventing more drivers switching to electric". "As things stand, those wanting to use publicly-accessible charging points may need to register with several different companies that run them," the Department for Transport added. "The planned legislation will include measures to ensure drivers need register only once to make full use of the existing infrastructure."
- Europe > United Kingdom (1.00)
- North America > The Bahamas (0.16)
- North America > Panama (0.15)